Automated Classification of Dry Bean Varieties Using XGBoost and SVM Models
This paper presents a comparative study on the automated classification of seven different varieties of dry beans using machine learning models. Leveraging a dataset of 12,909 dry bean samples, reduced from an initial 13,611 through outlier removal and feature extraction, we applied Principal Component Analysis (PCA) for dimensionality reduction and trained two multiclass classifiers: XGBoost and Support Vector Machine (SVM). The models were evaluated using nested cross-validation to ensure robust performance assessment and hyperparameter tuning. The XGBoost and SVM models achieved overall correct classification rates of 94.00% and 94.39%, respectively. The results underscore the efficacy of these machine learning approaches in agricultural applications, particularly in enhancing the uniformity and efficiency of seed classification. This study contributes to the growing body of work on precision agriculture, demonstrating that automated systems can significantly support seed quality control and crop yield optimization. Future work will explore incorporating more diverse datasets and advanced algorithms to further improve classification accuracy.
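The pipeline described above (dimensionality reduction followed by a tuned multiclass classifier, evaluated with nested cross-validation) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic data, PCA component count, and SVM parameter grid are all assumptions standing in for the actual dry bean dataset and settings.

```python
# Hedged sketch of the evaluation pipeline: PCA for dimensionality reduction,
# an SVM classifier, hyperparameters tuned in an inner CV loop, and
# generalization estimated in an outer CV loop (nested cross-validation).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the dry bean data: 7 classes, 16 shape features.
X, y = make_classification(n_samples=700, n_features=16, n_informative=10,
                           n_classes=7, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=8)),   # assumed component count
    ("svm", SVC()),
])

# Inner loop tunes hyperparameters; outer loop estimates generalization,
# so tuning never sees the outer test folds.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.01]}
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(pipe, param_grid, cv=inner)
scores = cross_val_score(search, X, y, cv=outer)
print(f"nested-CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The same skeleton accommodates the paper's XGBoost model by swapping the `svm` step for a gradient-boosted classifier with its own parameter grid.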
Adaptive boosting with dynamic weight adjustment
Mangina, Vamsi Sai Ranga Sri Harsha
To capture complex relationships among the data, we can use Adaptive Boosting with Dynamic Weight Adjustment, an enhancement of the traditional adaptive boosting algorithm commonly known as AdaBoost, a powerful ensemble learning technique. The technique improves efficiency and accuracy by dynamically updating instance weights based on prediction error: rather than updating weights uniformly as in traditional AdaBoost, each weight is updated in proportion to that instance's error. The weight-update process is more adaptive because it accounts for both the classification errors of individual instances and the overall error distribution. This enables the model to handle multiclass and more complex data efficiently, enhancing performance compared to traditional AdaBoost.
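The contrast between the two update rules can be made concrete with a minimal sketch. This is an assumed illustration, not the authors' implementation: `dynamic_update` and its per-instance error input are hypothetical names showing one way an error-proportional update could work.

```python
# Classic AdaBoost update vs. a dynamic, per-instance variant in which the
# boost scales with each instance's own prediction error.
import numpy as np

def adaboost_update(weights, misclassified):
    """Classic AdaBoost: every misclassified instance gets the same boost."""
    err = np.sum(weights[misclassified]) / np.sum(weights)
    err = np.clip(err, 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)          # one alpha for the round
    new_w = weights * np.exp(alpha * np.where(misclassified, 1.0, -1.0))
    return new_w / new_w.sum()                     # renormalize

def dynamic_update(weights, per_instance_error):
    """Dynamic variant (illustrative): each weight grows in proportion to
    that instance's error relative to the round's mean error, so harder
    instances are up-weighted more than mildly wrong ones."""
    new_w = weights * np.exp(per_instance_error - per_instance_error.mean())
    return new_w / new_w.sum()

w = np.full(5, 0.2)
mis = np.array([True, False, False, True, False])  # binary error signal
errs = np.array([0.9, 0.1, 0.0, 0.5, 0.2])         # continuous error signal
print(adaboost_update(w, mis))
print(dynamic_update(w, errs))
```

Note how the classic rule gives instances 0 and 3 identical boosts, while the dynamic rule separates them according to how badly each was predicted.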
A Novel Metric for Measuring Data Quality in Classification Applications (extended version)
Jouseau, Roxane, Salva, Sébastien, Samir, Chafik
Data quality is a key element for building and optimizing good learning models. Despite many attempts to characterize data quality, there is still a need for a rigorous formalization and an efficient measure of quality from available observations. Indeed, without a clear understanding of the training and testing processes, it is hard to evaluate the intrinsic performance of a model. Moreover, tools for measuring data quality specific to machine learning are still lacking. In this paper, we introduce and explain a novel metric to measure data quality. This metric is based on the correlated evolution between classification performance and the deterioration of data. The proposed method has the major advantage of being model-independent. Furthermore, we provide an interpretation of each criterion and examples of assessment levels. We confirm the utility of the proposed metric with intensive numerical experiments and detail some illustrative cases with controlled and interpretable qualities.
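The core idea, correlating classification performance with controlled deterioration of the data, can be illustrated in a few lines. This is a loose sketch of the general principle, not the authors' exact metric: the label-flipping noise model, the classifier, and the use of a plain correlation coefficient are all assumptions.

```python
# Deteriorate a dataset in increasing steps, record how classification
# accuracy degrades, and inspect the correlation between noise level and
# accuracy as a crude, model-independent quality signal.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

levels = np.linspace(0.0, 0.4, 5)   # fraction of labels randomly rewritten
accs = []
for p in levels:
    y_noisy = y.copy()
    flip = rng.random(len(y)) < p               # pick instances to corrupt
    y_noisy[flip] = rng.integers(0, 2, flip.sum())
    accs.append(cross_val_score(KNeighborsClassifier(), X, y_noisy, cv=5).mean())

# A strong negative correlation means accuracy tracks deterioration closely.
corr = np.corrcoef(levels, accs)[0, 1]
print(f"accuracies: {np.round(accs, 3)}, correlation: {corr:.2f}")
```

Because the deterioration is injected rather than inferred, any classifier can play the probe role, which is what makes this style of measurement model-independent.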
An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning
Müller, Sebastian, Toborek, Vanessa, Beckh, Katharina, Jakobs, Matthias, Bauckhage, Christian, Welke, Pascal
The Rashomon Effect describes the following phenomenon: for a given dataset there may exist many models with equally good performance but with different solution strategies. The Rashomon Effect has implications for Explainable Machine Learning, especially for the comparability of explanations. We provide a unified view on three different comparison scenarios and conduct a quantitative evaluation across different datasets, models, attribution methods, and metrics. We find that hyperparameter-tuning plays a role and that metric selection matters. Our results provide empirical support for previously anecdotal evidence and exhibit challenges for both scientists and practitioners.
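The phenomenon itself is easy to reproduce in miniature. The sketch below is an assumed illustration, not the paper's experimental setup: it trains two model families of comparable accuracy on data with redundant informative features and compares which features each model ranks as most important.

```python
# Two models with comparable accuracy can rely on different features,
# so their explanations need not agree (the Rashomon Effect in miniature).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Redundant informative features give models room to choose different ones.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           n_redundant=4, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X, y)
dt = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

acc_lr = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
acc_dt = cross_val_score(DecisionTreeClassifier(max_depth=4, random_state=0),
                         X, y, cv=5).mean()

# Compare which features each model deems most important.
top_lr = sorted(np.argsort(np.abs(lr.coef_).ravel())[-3:])
top_dt = sorted(np.argsort(dt.feature_importances_)[-3:])
print(f"accuracies: {acc_lr:.2f} vs {acc_dt:.2f}; "
      f"top-3 features: {top_lr} vs {top_dt}")
```

Whenever the two top-3 sets differ while the accuracies stay close, the models embody distinct solution strategies, which is exactly the comparability problem the paper evaluates for attribution methods.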